Abstract 

Purpose The analysis of uncertainty in life cycle assessment 
(LCA) studies has been a topic for more than 10 years, and 
many commercial LCA programs now feature a sampling 
approach called Monte Carlo analysis. Yet, a full Monte 
Carlo analysis of a large LCA system, for instance containing 
the 4,000 unit processes of ecoinvent v2.2, is rarely carried out 
by LCA practitioners. One reason for this is computation time. 
A faster alternative to the Monte Carlo method is analytical 
error propagation by means of a Taylor series expansion; 
however, this approach suffers from being explained in the 
literature in conflicting ways, hampering implementation in 
most software packages for LCA. The purpose of this paper is 
to compare the two different approaches from a theoretical and 
practical perspective. 

Methods In this paper, we compare the analytical and sampling 
approaches in terms of their theoretical background and their 
mathematical formulation. Using three case studies (one stylized, 
one real-sized, and one input-output (IO)-based), we 
approach these techniques from a practical perspective and 
compare them in terms of speed and results. 

Results Depending on the precise question, a sampling or an 
analytical approach provides more useful information. 
Whenever they provide the same indicators, an analytical 
approach is much faster but less reliable when the uncertainties 
are large. 

Conclusions For a good analysis, analytical and sampling 
approaches are equally important. We recommend that practitioners 
use both whenever available and that software suppliers 
implement both. 

Keywords Analytical methods • Gaussian error 
propagation • IOA • LCA • Monte Carlo • Sampling methods • 
Uncertainty 


1 Introduction 

Uncertainty is a pervasive topic in life cycle assessment 
(LCA). It is so in a fundamental sense: uncertainty is present 
in many forms in all stages of an LCA. It is also well represented 
in the recent scientific literature: the last few volumes of the 
International Journal of Life Cycle Assessment contain papers 
on uncertainty in LCA, either recognizing that there is 
uncertainty or presenting approaches to manage it. 
Review articles on uncertainty have been published by various 
authors (Heijungs and Huijbregts 2004; Lloyd and Ries 
2007). One of the most used life cycle inventory (LCI) databases, 
ecoinvent, contains specifications of parameter uncertainty 
for many coefficients of most unit processes. Most 
software for LCA is by now able to deal with uncertainties, in 
most cases on the basis of Monte Carlo simulations. A UNEP- 
SETAC-endorsed working group on uncertainties in LCA has 
been active for a few years. 

Uncertainty shows up in LCA in many ways: 

• Input data (such as fuel consumption, CO2 emissions, and 
characterisation factors) may be uncertain or conflicting 
due to inaccurate measurements, or they may be subject to 
variability from day to day or from source to source. 

• The LCA procedure requires a number of choices and 
assumptions (for instance, on system boundaries, consequential 
vs. attributional LCA, and the time horizon for 
global warming), and such choices are debatable. 

• Uncertain and variable data and different choices percolate 
through the LCA model via error propagation, leading to 
uncertain results. 

• Uncertain results may be interpreted by decision-makers 
in varying ways, depending on preferences, time, or framing 
of the decision situation, amongst other factors. 

In any case, uncertainty needs to be documented in LCA 
studies, and the framework shown in Fig. 1 can be used for 
this purpose. 

This article addresses one specific issue pertaining to uncertainty 
in relation to LCA: the estimation of the uncertainty of 
LCA results using given uncertainty estimates of the LCA 
input parameters. This issue is represented by the step indicated 
by bold lines in Fig. 1. The subject field behind this is known as 
the theory of error propagation, as it addresses the question of 
how input uncertainties propagate into output uncertainties 
through the LCA model. In this paper, we will restrict our 
attention to LCA based on unit process data and LCA based on 
input-output (IO) data. Customized, parameterized LCA 
models, in which for instance the fuel input and the emissions 
depend on a fuel efficiency parameter at a higher level, are not 
discussed in this paper. The chief purpose of this paper is to 
demonstrate to practitioners the principles, assumptions, and 
comparative advantages of the main methods for propagating 
parameter uncertainty to overall LCI uncertainty, under well- 
defined assumptions (i.e., no interaction between variables and 
relatively small and well-behaved uncertainties). 

Basically, two classes of techniques are available to study 
error propagation: sampling methods and analytical methods. 

Sampling methods address the problem by sampling from the 
probability distributions of the input parameters and 
re-calculating the LCA results for every member of this sample 
(Morgan and Henrion 1990). This produces a sample of results, 
and from this sample, several statistics can be computed, such 
as the mean, the standard deviation, the median, 95 % confidence 
intervals, correlation coefficients, etc. The most well-known 
sampling method is Monte Carlo simulation. A more sophisticated 
method is the Latin hypercube method, where the sampling 
strategy is not entirely random but utilizes stratified 
probability distributions. Yet another approach features a 
calculation run for every combination of parameters in which 
each parameter assumes the highest and the lowest value. 

Fig. 1 Framework for treating parameter and model uncertainty in LCA. 
The bold box represents the emphasis of this paper 

Sampling methods are conceptually easy to understand, 
and they are easy to implement in software. There exist 
commercial packages (such as Crystal Ball and @RISK) that 
can be used as a plug-in together with other software (such as 
Microsoft Excel). A brief survey suggests that many present-day 
LCA programs have implemented the Monte Carlo method, 
which naturally means that most LCA studies that include 
an uncertainty analysis do this on the basis of Monte Carlo 
simulation (Lloyd and Ries 2007). 

Sampling methods, however, also have disadvantages. The 
number of simulations must be large enough to ensure that 
sample means are estimated with small enough standard 
deviations. As a consequence, Monte Carlo studies 
typically take 10,000 calculation runs. Using a Latin 
hypercube strategy may reduce this, but LCA programs do not, 
as far as we know, include this more advanced sampling 
method. For the conceptually simpler strategy of using all 
combinations of highest and lowest value, Heijungs (1996) 
has shown that the computation time required in LCAs may 
easily exceed the age of the universe. In a small LCA, the 
computation time for LCI and/or LCIA can typically be less 
than a second. At about 1 s per run, running 1,000 Monte Carlo 
simulations would then require a quarter of an hour. This is 
doable, although repeating this for many pollutants and/or many 
product alternatives is not an attractive option. Running 10,000 
simulations would require 3 h, and that already is more of a 
problem. Present-day LCA inventories tend to be large. As an 
illustration, the 1996 version of the ETH energy database 
contains 1,200 processes, ecoinvent v1 from 2002 contains 2,500 
processes, ecoinvent v2 (2007) 4,000 processes, and the 
recently released version v3 about 10,000 processes. One popular 
way of solving an LCI is the inversion of an N × N square 
matrix, which is typically an operation requiring N^3 
single computational steps (Press et al. 1992). As a result, 
doubling a database in size implies an eightfold increase in 
computation time, all other things remaining equal. Indeed, 
whereas ecoinvent v1 was shipped with the results of Monte 
Carlo calculations, no such calculations have been performed 
for some of the versions of ecoinvent v2. A smarter sampling 
strategy, for example using a Latin hypercube approach, may 
reduce the number of simulations needed, and a 
smarter algorithm for computing an LCI (Peters 2007) may 
reduce the required effort as well. Yet, even when one run 
takes 1 min, unacceptably large computation times are 
required for day-to-day consultants’ purposes. 

An alternative way to study the propagation of 
uncertainties is an analytical approach. Such 
an approach is more difficult to understand and requires 
more mathematical background, although once implemented 
in software, no mathematical expertise is required 
from the LCA practitioner. The analytical approach 
is based on calculus, applying a local derivative 
of the mathematical function that specifies how inputs 
are transformed into outputs. In contrast to a sampling 
method, it does not provide a probability distribution of 
the outputs. Rather, it only estimates the mean and the 
standard deviation (or its square, the variance). This, 
however, is sufficient for many purposes. 

An initial study (Heijungs et al. 2005) compared the performance 
of the sampling and analytical approaches and concluded 
the following: “When one Monte Carlo run takes 30 s, 
1,000 runs require a working day. The analytical approach can 
reduce this to a few minutes, while the results are basically the 
same” (p. 111). This was, however, on one test system, using 
outdated software, and it was certainly not an exhaustive 
analysis of sampling vis-à-vis analytical methods. 

The present paper approaches the two methods in 
more detail. We discuss three different ways of elaborating 
the analytical approach: one as published by 
Heijungs (1994), one by Hong et al. (2010), and one 
on the basis of input-output analysis, while the section 
thereafter discusses the sampling method in more detail. 
Section 4 presents three case studies in which the analytical 
method is compared to the sampling approach. 
Section 5 concludes the paper. 


2 Analytical approaches to error propagation in LCA 

In this section, we first discuss the theory of analytical error 
propagation (Section 2.1) and then introduce a number of 
alternative formulations for error propagation (Section 2.2), 
notably the analytical method. Finally, we present the analytical 
expressions for the LCA model, according to different 
publications (Section 2.3). 


2.1 Theory of error propagation 

2.1.1 Taylor series expansion 

Taylor’s theorem is an established part of mathematical analysis 
(Apostol 1967). It asserts that we can calculate the value 
of a function f for a value x if we know it at another point a, 
using a number of terms that involve the derivatives of f with 
respect to x and the distance between x and a. The 
approximation of f(x) can be made more precise by taking 
into account more of these terms, viz., by taking a longer 
expansion. The precise form is as follows: 

$$ f(x) = f(a) + \frac{df(a)}{dx}(x-a) + \frac{1}{2!}\frac{d^2 f(a)}{dx^2}(x-a)^2 + \frac{1}{3!}\frac{d^3 f(a)}{dx^3}(x-a)^3 + \cdots \tag{1} $$

This approximation does not hold for every function f; 
it requires some assumptions (such as differentiability) 
that we will assume to hold in the cases we discuss 
hereafter. An example is the Taylor series expansion for 
sin(x) around the value a = 0: 
$\sin(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots$. 
For small values of x, we can thus use sin(x) ≈ x; for somewhat 
larger values of x, we need to take account of the second 
term as well. In this way, we can use within a certain 
region a simple linear approximation for any complicated 
function. When we use the first term only, we speak of a 
first-order approximation, including the second term gives 
a second-order approximation, etc. 

2.1.2 Determining variance: an illustrative example 

To explain the analytical error propagation method, we first 
take a simple example. Suppose we want to calculate the area 
A of a rectangular sheet of paper with width w and height h. 
The formula to do this is simply 

$$ A = w \times h \tag{2} $$

Let us suppose that both w and h are empirically determined 
variables, which are not exactly known, but which are known to 
be specified according to probability distributions with 
variances var(w) and var(h), where var is the square of the standard 
deviation of the probability distribution, often written as $s^2$ or 
$\sigma^2$. Further, we assume that the errors in w and h are 
independent, i.e., that the covariance of w and h is 0 (we discuss the 
plausibility and consequences of this assumption in the last 
section). The task is now to specify the probability distribution 
of A and in particular its variance var(A). The theory of error 
propagation (Bevington and Robinson 1994; Morgan and 
Henrion 1990; Ayyub and Klir 2006) gives an answer to this: 

$$ \mathrm{var}(A) \approx h^2 \times \mathrm{var}(w) + w^2 \times \mathrm{var}(h), \tag{3} $$

where the ≈ indicates that it is an approximate value, as we 
have only used a first-order approximation and have neglected 
any possible correlation between the values of w and h. Thus, 
given the mean values of w and h and the variances of w and h, 
the mean and the variance of A can be computed using two 
simple formulas. 
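
The sheet-of-paper example can be checked numerically. The sketch below is only an illustration: the sheet dimensions, the standard deviations, the normal distributions, and the function names are our own assumptions, not part of the original example. It evaluates the first-order formula for var(A) and compares it with the variance of a Monte Carlo sample.

```python
import math
import random
import statistics


def analytical_area_variance(w, h, var_w, var_h):
    """First-order error propagation for A = w * h with independent inputs."""
    return h ** 2 * var_w + w ** 2 * var_h


def sampled_area_variance(w, h, var_w, var_h, runs=20000, seed=1):
    """Monte Carlo estimate of var(A), assuming independent normal inputs."""
    rng = random.Random(seed)
    areas = [
        rng.gauss(w, math.sqrt(var_w)) * rng.gauss(h, math.sqrt(var_h))
        for _ in range(runs)
    ]
    return statistics.variance(areas)


# Hypothetical numbers: an A4-like sheet (cm) with a standard deviation of 0.1 cm.
w, h = 21.0, 29.7
var_w, var_h = 0.1 ** 2, 0.1 ** 2

var_analytical = analytical_area_variance(w, h, var_w, var_h)
var_sampled = sampled_area_variance(w, h, var_w, var_h)
print(f"analytical: {var_analytical:.4f}  sampled: {var_sampled:.4f}")
```

For independent inputs, the exact variance of a product contains one extra term, var(w) × var(h), which the first-order formula drops; with uncertainties as small as those assumed here, that term is negligible.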

This is a special and simple case. The more general form in 
the case of two input variables x and y is that of an output 
variable z that is an arbitrary function of the two input 
variables, 

$$ z = z(x, y). $$

In that case, we have 

$$ \mathrm{var}(z) \approx \left(\frac{\partial z}{\partial x}\right)^2 \mathrm{var}(x) + \left(\frac{\partial z}{\partial y}\right)^2 \mathrm{var}(y) + 2\left(\frac{\partial z}{\partial x}\right)\left(\frac{\partial z}{\partial y}\right)\mathrm{cov}(x, y), \tag{4} $$

where we have now included the possibility that the errors of 
the variables x and y are correlated, expressed through the 
covariance (cov) of x and y (see, e.g., Bevington and Robinson 
1994). It is now a task to find the expression for the function 
z(x, y) in LCA. This will be discussed in the next section. 
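
The general two-variable formula, including the covariance term, can be sketched for an arbitrary function as follows. The function name, the use of central differences for the partial derivatives, and the step size are illustrative assumptions; an implementation with analytical derivatives would follow the same pattern.

```python
def propagate_variance(func, x, y, var_x, var_y, cov_xy=0.0, rel_step=1e-6):
    """First-order variance of z = func(x, y), with numerical partial derivatives."""
    # Step sizes scale with the inputs; fall back to rel_step when an input is 0.
    dx = abs(x) * rel_step or rel_step
    dy = abs(y) * rel_step or rel_step
    dz_dx = (func(x + dx, y) - func(x - dx, y)) / (2 * dx)
    dz_dy = (func(x, y + dy) - func(x, y - dy)) / (2 * dy)
    return dz_dx ** 2 * var_x + dz_dy ** 2 * var_y + 2 * dz_dx * dz_dy * cov_xy


# Area of a sheet with independent errors (hypothetical numbers).
var_area = propagate_variance(lambda w, h: w * h, 21.0, 29.7, 0.01, 0.01)

# A difference of two fully correlated inputs: the covariance term cancels the rest.
var_diff = propagate_variance(lambda a, b: a - b, 5.0, 3.0, 1.0, 1.0, cov_xy=1.0)
print(var_area, var_diff)
```

The second call illustrates why the covariance term matters: when two inputs move together, the uncertainty of their difference can be far smaller than the independence assumption suggests.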

2.1.3 Assumptions and definitions 

To study the analytical approach toward error propagation in 
LCA, we will first make several assumptions. 

• We focus on the LCI phase. Extension of these ideas to 
cover characterization, normalization, weighting, etc. is 
straightforward, albeit laborious (Ciroth et al. 
2004). 

• We restrict the discussion to parameter uncertainty. Model 
uncertainty and other forms of uncertainty that may be 
involved in setting system boundaries, allocation, etc. are 
excluded here. 

• We consider only random errors, not systematic errors. 
If known, systematic errors should be corrected before 
an uncertainty analysis, which is easily possible, also 
for LCAs (Bevington 1994, p. 41; Ciroth 2001, pp. 
126). 

In order to discuss LCA in mathematical terms, some 
conventions and symbols must be chosen. Following the 
symbols introduced by Heijungs and Suh (2002), we use the 
following: 

• Bold lowercase letters (like z) for vectors, bold capital 
letters (like Z) for matrices, and italic letters (like z) for 
scalars 

• The product (and material, energy, ...) requirement and 
supply of unit processes are specified as a column vector 
a, in which an element $a_i$ indicates the use or supply of 
product i 

• A positive coefficient $a_i$ means production, a negative 
coefficient means use, a zero value means that a product 
is not involved in a process 

• The environmental exchanges (the elementary flows) of 
unit processes are specified as a column vector b, in which 
an element $b_i$ indicates the use of resource i or the 
emission of pollutant i 

• Different unit processes can be combined in an LCA 
system by horizontally concatenating column vectors into 
matrices A and B 

• The functional unit is defined in the form of a vector f, in 
which the reference flow (for product i) is represented as 
the only non-zero element $f_i$ 

• The inventory result is defined as a vector g, of which an 
element $g_i$ indicates the system-wide use of resource i or 
the system-wide emission of pollutant i 

2.2 Review of alternative error propagation formulations 

Even though the form of the function g(B, A, f) is an 
elementary question in the calculation of LCA, it has received 
remarkably little attention (Heijungs and Suh 2002). Most LCI 
and LCA textbooks do not discuss it or only superficially 
mention the issue without proposing a specific formula. 

Several solutions to this question have been provided by 
different authors. 

• Heijungs et al. (1992, 1994) use Cramer’s rule. 

• Heijungs (1996) and Heijungs and Suh (2002) use matrix 
inversion. 

• Ciroth (2001, 2004) uses a sequential calculation approach. 
Due to complications with looped systems (which 
we believe is the default situation), this will not be 
discussed in this paper. 

• Hong et al. (2010) and Imbeault-Tetrault et al. (2013) do 
not discuss the form of g, but they do discuss the form of 
its derivative in the context of uncertainty propagation. 

These differences in approach naturally lead to differences 
in the form of the equation for g and hence in the equations for 
its derivatives with respect to B, A, and f, i.e., 
$\partial g/\partial \mathbf{B}$, $\partial g/\partial \mathbf{A}$, and $\partial g/\partial \mathbf{f}$. 

2.2.1 LCI according to Heijungs 

In a series of contributions, Heijungs et al. have developed 
explicit equations for the determination of g and its derivatives. 

Already in the Dutch LCA guidebook (Heijungs et al. 1992), 
the question of calculating a life cycle inventory is presented as 
one of solving a system of linear equations. Under the usual 
assumptions of linear technology, market clearing (product 
balance), steady state, an equal number of processes and products, 
etc., the model equations can be written as follows: 

$$ \left\{ \begin{array}{l} \mathbf{A}\mathbf{s} = \mathbf{f} \\ \mathbf{g} = \mathbf{B}\mathbf{s} \end{array} \right. \tag{5} $$

where the bracket indicates that these two equations hold 
simultaneously. The first system of equations specifies the 
known f and A, and the challenge is to solve for the elements 
of s, which are interpreted as the intensities, levels, or scaling 
factors of the processes in the system. The result can be fed 
into the second equation, to find g by straightforward 
multiplication. 

Heijungs et al. (1992) and Heijungs (1994) adopt Cramer’s 
rule to solve the first equation. In later publications, Heijungs 
et al. turned from Cramer’s rule to matrix inversion (Heijungs 
1996; Heijungs and Suh 2002). In these publications, the 
equations are 

$$ \mathbf{s} = \mathbf{A}^{-1}\mathbf{f} \tag{6} $$

and 

$$ \mathbf{g} = \mathbf{B}\mathbf{A}^{-1}\mathbf{f} = \mathbf{\Lambda}\mathbf{f}. \tag{7} $$
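
As a minimal sketch of the matrix-inversion route, the following code solves a two-process system. The coefficients (an electricity process and a fuel process, two emissions, and one resource) are hypothetical illustration values, and the helper `solve_inventory` is our own naming; it uses the closed-form inverse of a 2 × 2 matrix, so it is not a general solver.

```python
def solve_inventory(A, B, f):
    """Return (s, g) for a 2x2 technology matrix A: solve A s = f, then g = B s."""
    (a11, a12), (a21, a22) = A
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("technology matrix is singular")
    # s = A^-1 f, written out with the closed-form inverse of a 2x2 matrix.
    s = [
        (a22 * f[0] - a12 * f[1]) / det,
        (-a21 * f[0] + a11 * f[1]) / det,
    ]
    # g = B s: one entry per environmental flow.
    g = [sum(b_kj * s_j for b_kj, s_j in zip(row, s)) for row in B]
    return s, g


# Hypothetical system. Rows of A: fuel (L), electricity (kWh).
# Columns: electricity production, fuel production.
A = [[-2.0, 100.0],
     [10.0, 0.0]]
# Rows of B: CO2 (kg), SO2 (kg), crude oil (L; negative means extraction).
B = [[1.0, 10.0],
     [0.1, 2.0],
     [0.0, -50.0]]
f = [0.0, 1000.0]  # functional unit: 1,000 kWh of electricity

s, g = solve_inventory(A, B, f)
print("scaling factors:", s)
print("inventory result:", g)
```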


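Heijungs and Suh (2002, p. 133 ff.) provide equations for 
the derivatives. For instance, it is found that 

$$ \frac{\partial g_k}{\partial a_{ij}} = -\left(\mathbf{B}\mathbf{A}^{-1}\right)_{ki}\left(\mathbf{A}^{-1}\mathbf{f}\right)_j = -\lambda_{ki} s_j \tag{8} $$

A similar formula is discussed for $\partial g_k/\partial b_{ij}$. The one for 
$\partial g_k/\partial f_j$ is assumed to be zero, as the final demand vector 
is user-defined and free from error in most LCA studies. Inserted 
in the formula for error propagation [4] without covariance term, 
we easily find 

$$ \mathrm{var}(g_k) = \sum_{i,j} \left(s_j \lambda_{ki}\right)^2 \mathrm{var}(a_{ij}) + \sum_{j} \left(s_j\right)^2 \mathrm{var}(b_{kj}) \tag{9} $$

Heijungs (2010) provides a systematic overview of all 
derivatives that can show up in LCA. 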
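
The variance formula above can be sketched in code for a two-process system. The coefficients, the uniform 5 % coefficient of variation, and the function names are hypothetical choices for illustration only; the closed-form 2 × 2 inverse stands in for a general matrix inversion.

```python
def lci_terms(A, B, f):
    """Return s = A^-1 f, lam = B A^-1 and g = B s for a 2x2 technology matrix."""
    (a11, a12), (a21, a22) = A
    det = a11 * a22 - a12 * a21
    A_inv = [[a22 / det, -a12 / det], [-a21 / det, a11 / det]]
    s = [sum(A_inv[i][j] * f[j] for j in range(2)) for i in range(2)]
    lam = [[sum(row[m] * A_inv[m][i] for m in range(2)) for i in range(2)]
           for row in B]
    g = [sum(row[j] * s[j] for j in range(2)) for row in B]
    return s, lam, g


def variance_of_result(A, B, f, var_A, var_B, k):
    """First-order variance of g_k from var(a_ij) and var(b_kj), no covariances."""
    s, lam, _ = lci_terms(A, B, f)
    # Technology terms: (s_j * lambda_ki)^2 * var(a_ij), summed over i and j.
    from_A = sum((s[j] * lam[k][i]) ** 2 * var_A[i][j]
                 for i in range(2) for j in range(2))
    # Intervention terms: s_j^2 * var(b_kj), summed over j.
    from_B = sum(s[j] ** 2 * var_B[k][j] for j in range(2))
    return from_A + from_B


# Hypothetical system and uncertainties (coefficient of variation of 5 %).
A = [[-2.0, 100.0], [10.0, 0.0]]
B = [[1.0, 10.0], [0.1, 2.0], [0.0, -50.0]]
f = [0.0, 1000.0]
cv = 0.05
var_A = [[(cv * a) ** 2 for a in row] for row in A]
var_B = [[(cv * b) ** 2 for b in row] for row in B]

var_co2 = variance_of_result(A, B, f, var_A, var_B, k=0)
print("variance of the first inventory result:", var_co2)
```

A single evaluation of the derivatives replaces the thousands of re-calculations that a sampling approach would need, which is the source of the speed advantage discussed in this paper.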

2.2.2 Input-output-based LCI 

During the past 10 years, economic IO techniques have made 
their way into LCA as an essential ingredient of hybrid 
methods combining IO-based and process-based methods 
(Suh and Nakamura 2007). The perceived advantage of this 
hybridization is that the best of both worlds (infinite upstream 
coverage of IO and detail and specificity of process 
methods) are combined in one approach (Bullard et al. 1978; 
Moskowitz and Rowe 1985; Suh et al. 2004). Hybrid LCI 
methods have been described in detail by Heijungs and Suh 
(2002), Suh (2004), and Suh and Huppes (2005). Williams 
et al. (2009) propose a hybrid approach combining process 
and IO approaches to LCI uncertainty analysis. 

IO data are, like process-based data, subject to uncertainty. 
Below, we will elaborate the IO setup for an 
industry-by-industry monetary table in coefficient form, 
with an emphasis on the connection with uncertainty 
propagation. Other setups (product-by-product, physical, 
transaction form) are easy to derive from the base case. 

Let one element of the environmental repercussions b (e.g., 
sectoral CO2 emissions) of an arbitrary functional unit vector f 
(e.g., a household’s final consumption) be defined as 
$g = \mathbf{b}'\mathbf{L}\mathbf{f} = \mathbf{b}'(\mathbf{I}-\mathbf{A})^{-1}\mathbf{f}$, 
where $\mathbf{L} = (\mathbf{I}-\mathbf{A})^{-1}$ is the Leontief inverse of the 
direct requirements or technical coefficient matrix A. The 
matrix A here contains only input data (as positive numbers), 
while A in the process-based setup represents both inputs 
(negative) and outputs (positive). The implicit output of every 
sector is 1; hence, the “I” and the sign reversal explain the 
minus sign in front of the A. 

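Ignoring variations in b and f, the Taylor series of g as a 
function g(A) is 

$$ \begin{aligned} g(\mathbf{A}) &\approx g(\mathbf{A}_0) + \sum_{i,j} \left(A_{ij} - A_{0,ij}\right) \frac{\partial g}{\partial A_{ij}}(\mathbf{A}_0) \\ &\quad + \frac{1}{2} \sum_{i,j} \sum_{k,l} \left(A_{ij} - A_{0,ij}\right)\left(A_{kl} - A_{0,kl}\right) \frac{\partial^2 g}{\partial A_{ij}\,\partial A_{kl}}(\mathbf{A}_0) \\ &= g(\mathbf{A}_0) + \mathbf{1}'\left[(\mathbf{A}-\mathbf{A}_0)\,\#\,Dg(\mathbf{A}_0)\right]\mathbf{1} + \tfrac{1}{2}\,\mathbf{1}'\left[(\mathbf{A}-\mathbf{A}_0)\,\#\,D^2 g(\mathbf{A}_0)\,\#\,(\mathbf{A}-\mathbf{A}_0)\right]\mathbf{1}, \end{aligned} \tag{10} $$

where # denotes the element-wise product, and $\mathbf{1}' = (1, 1, \ldots, 1)$ 
is a suitable summation operator. $Dg(\mathbf{A}_0)$ is called the gradient 
of g at $\mathbf{A}_0$, and $D^2 g(\mathbf{A}_0)$ is called the Hessian of g at $\mathbf{A}_0$. For 
g(A) as above, we find for example 

$$ \begin{aligned} \frac{\partial g}{\partial A_{ij}} &= \frac{\partial\left(\mathbf{b}'\mathbf{f} + \mathbf{b}'\mathbf{A}\mathbf{f} + \mathbf{b}'\mathbf{A}^2\mathbf{f} + \mathbf{b}'\mathbf{A}^3\mathbf{f} + \cdots\right)}{\partial A_{ij}} \\ &= 0 + b_i f_j + \left[(\mathbf{b}'\mathbf{A})_i f_j + b_i (\mathbf{A}\mathbf{f})_j\right] + \left[(\mathbf{b}'\mathbf{A}^2)_i f_j + (\mathbf{b}'\mathbf{A})_i (\mathbf{A}\mathbf{f})_j + b_i (\mathbf{A}^2\mathbf{f})_j\right] \\ &\quad + \left[(\mathbf{b}'\mathbf{A}^3)_i f_j + (\mathbf{b}'\mathbf{A}^2)_i (\mathbf{A}\mathbf{f})_j + (\mathbf{b}'\mathbf{A})_i (\mathbf{A}^2\mathbf{f})_j + b_i (\mathbf{A}^3\mathbf{f})_j\right] + \cdots \\ &= \left[b_i + (\mathbf{b}'\mathbf{A})_i + (\mathbf{b}'\mathbf{A}^2)_i + \cdots\right]\left[f_j + (\mathbf{A}\mathbf{f})_j + (\mathbf{A}^2\mathbf{f})_j + \cdots\right] = (\mathbf{b}'\mathbf{L})_i (\mathbf{L}\mathbf{f})_j \end{aligned} \tag{11} $$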
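and 

$$ \left\{Dg(\mathbf{A}_0)\right\}_{ij} = \left[\mathbf{b}'(\mathbf{I}-\mathbf{A}_0)^{-1}\right]_i \left[(\mathbf{I}-\mathbf{A}_0)^{-1}\mathbf{f}\right]_j \tag{12} $$

so that up to first order 

$$ g(\mathbf{A}) - g(\mathbf{A}_0) = \sum_{i,j} \left(A_{ij} - A_{0,ij}\right) \left[\mathbf{b}'(\mathbf{I}-\mathbf{A}_0)^{-1}\right]_i \left[(\mathbf{I}-\mathbf{A}_0)^{-1}\mathbf{f}\right]_j \tag{13} $$

g can also be understood as a function of the input-output 
intermediate transactions matrix $\mathbf{T} = \mathbf{A}\hat{\mathbf{x}}$, i.e., $T_{ij} = A_{ij} x_j$ 
(where x is total use by sector, see Section 4.3), and a similar 
approach can be taken for this option as well. With 

$$ \frac{\partial g}{\partial A_{ij}} = \frac{\partial g}{\partial T_{ij}} \frac{\partial T_{ij}}{\partial A_{ij}} \;\Leftrightarrow\; \frac{\partial g}{\partial T_{ij}} = \frac{(\mathbf{b}'\mathbf{L})_i (\mathbf{L}\mathbf{f})_j}{x_j} \tag{14} $$

we find the first-order approximation of g(T) 

$$ g(\mathbf{T}) - g(\mathbf{T}_0) = \sum_{i,j} \left(T_{ij} - T_{0,ij}\right) \frac{\left[\mathbf{b}'\left(\mathbf{I}-\mathbf{T}_0\hat{\mathbf{x}}^{-1}\right)^{-1}\right]_i \left[\left(\mathbf{I}-\mathbf{T}_0\hat{\mathbf{x}}^{-1}\right)^{-1}\mathbf{f}\right]_j}{x_j} \tag{15} $$

Similar relationships can be derived for b and f. Note that b 
and f can also be multi-row and multi-column matrices, 
handling error propagation simultaneously for multiple 
interventions and functional units. 

2.2.3 LCI according to Hong 

A completely different approach is the one taken by Hong 
et al. (2010). These authors do not start from the equation for 
LCI to calculate the derivatives and insert them into the Taylor 
formula. Rather, they essentially skip the LCI equation, and 
they do not even calculate explicit derivatives. Their approach 
is based on the idea that most data in LCA follow a log-normal 
distribution. Ignoring the covariance, the first-order Taylor 
series approximation is 

$$ \mathrm{var}(z) \approx \left(\frac{\partial z}{\partial x}\right)^2 \mathrm{var}(x) + \left(\frac{\partial z}{\partial y}\right)^2 \mathrm{var}(y) \tag{16} $$

By log-transforming all variables 

$$ x = \ln x'; \quad y = \ln y'; \quad z = \ln z' \tag{17} $$

this yields 

$$ \mathrm{var}(\ln z') \approx \left(\frac{\partial \ln z'}{\partial \ln x'}\right)^2 \mathrm{var}(\ln x') + \left(\frac{\partial \ln z'}{\partial \ln y'}\right)^2 \mathrm{var}(\ln y') \tag{18} $$

The terms with derivatives in Eq. [18] can be worked out 
with the derivative of the ln: 

$$ \frac{\partial \ln z'}{\partial \ln x'} = \frac{x'}{z'} \frac{\partial z'}{\partial x'} \tag{19} $$

Further, the standard deviation of a stochastic variable x, 
SD(x), is related to its geometric standard deviation, $\mathrm{GSD}(e^x)$, 
by 

$$ \mathrm{SD}(x) = \ln\left(\mathrm{GSD}(e^x)\right) = \ln\left(\mathrm{GSD}(x')\right). \tag{20} $$

Inserting this for SD(x) and SD(y), and writing $S_{z,x}$ and $S_{z,y}$ 
for the relative sensitivities of Eq. [19], yields 

$$ \ln\left(\mathrm{GSD}(z')\right)^2 = S_{z,x}^2 \ln\left(\mathrm{GSD}(x')\right)^2 + S_{z,y}^2 \ln\left(\mathrm{GSD}(y')\right)^2. \tag{21} $$

This is still a general form. For the case of LCA, the more 
specific expression 

$$ \ln\left(\mathrm{GSD}(g_k)\right)^2 = \sum_{i,j} S_{k,ij}^2 \ln\left(\mathrm{GSD}(a_{ij})\right)^2 + \sum_{j} S_{kj}^2 \ln\left(\mathrm{GSD}(b_{kj})\right)^2 \tag{22} $$

appears. The formula still contains two relative sensitivity 
coefficients, defined by 

$$ S_{k,ij} = \frac{\partial g_k}{\partial a_{ij}} \frac{a_{ij}}{g_k} \quad \text{and} \quad S_{kj} = \frac{\partial g_k}{\partial b_{kj}} \frac{b_{kj}}{g_k}. \tag{23} $$

These coefficients can be calculated using the analytical 
results of Heijungs (2010) or by repeated calculation, 
approximating, for instance 

$$ S_{k,ij} \approx \frac{\Delta g_k}{\Delta a_{ij}} \frac{a_{ij}}{g_k} \tag{24} $$

with $\Delta a_{ij}$ = 1 %. 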

According to Hong et al. (2010), the formulas apply “to the 
case where both input and output are log-normally distributed.” 
It is, however, unclear where this assumption is needed in 
the derivation above. A similar approach is taken by 
Imbeault-Tetrault et al. (2013), who list and discuss the 
assumptions: a multiplicative model, log-normally distributed 
input parameters, and independently distributed input parameters. 
We agree with these authors that the model is not multiplicative 
but with an even further-going argument: a matrix inverse 
resembles a division more than a multiplication. But in contrast to 
Imbeault-Tetrault et al. (2013), we do not see the need for 
making the assumption of log-normality because the formula 
for Gaussian error propagation nowhere makes restrictions on 
the shape of the distribution. Finally, we do agree with their 
assumption of independence of input distributions, not because 
it is true (it is not) but because the uncertainty model 
would become too complicated and the data demand would 
become too large. Because we disagreed with the assumption 
of log-normality, we have not pursued this approach 
any further. In a follow-up, a comparison between different 
analytical approaches may be carried out. 
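
The repeated-calculation route to the relative sensitivity coefficients can be sketched as follows. This is an illustration under our own assumptions, not the implementation of Hong et al. (2010): the two-process system, the uniform GSD of 1.05, the 1 % perturbation, and the function names are all hypothetical.

```python
import math


def inventory(A, B, f):
    """Solve the 2x2 system A s = f and return g = B s."""
    (a11, a12), (a21, a22) = A
    det = a11 * a22 - a12 * a21
    s = [(a22 * f[0] - a12 * f[1]) / det, (-a21 * f[0] + a11 * f[1]) / det]
    return [row[0] * s[0] + row[1] * s[1] for row in B]


def gsd_of_result(A, B, f, gsd_A, gsd_B, k, rel_step=0.01):
    """GSD of g_k from input GSDs and perturbation-based relative sensitivities."""
    base = inventory(A, B, f)[k]
    log_var = 0.0
    for which, matrix, gsd in (("A", A, gsd_A), ("B", B, gsd_B)):
        for i, row in enumerate(matrix):
            for j, value in enumerate(row):
                if value == 0:
                    continue  # absent flows are treated as exactly zero
                bumped = [list(r) for r in matrix]
                bumped[i][j] = value * (1 + rel_step)
                if which == "A":
                    g_new = inventory(bumped, B, f)[k]
                else:
                    g_new = inventory(A, bumped, f)[k]
                # Relative sensitivity: relative output change per relative input change.
                sensitivity = (g_new - base) / base / rel_step
                log_var += sensitivity ** 2 * math.log(gsd[i][j]) ** 2
    return math.exp(math.sqrt(log_var))


# Hypothetical system; every coefficient gets the same GSD of 1.05.
A = [[-2.0, 100.0], [10.0, 0.0]]
B = [[1.0, 10.0], [0.1, 2.0], [0.0, -50.0]]
f = [0.0, 1000.0]
gsd_A = [[1.05] * 2 for _ in A]
gsd_B = [[1.05] * 2 for _ in B]

gsd_co2 = gsd_of_result(A, B, f, gsd_A, gsd_B, k=0)
print(f"GSD of the first inventory result: {gsd_co2:.4f}")
```

Each uncertain coefficient costs one extra model run here, so for large systems the analytical derivatives are the cheaper way to obtain the same sensitivities.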

2.3 Some key metrics used in analytical error propagation 

2.3.1 Non-comparative metrics 

The approach described in Section 2.2 can be followed for 
every environmental flow, for every alternative. For each of 
these, it does not return a distribution. It only delivers a 
variance var(z) of some variable of interest as basic material. 
This variance can be used to calculate the standard deviation 

$$ \mathrm{sd}(z) = \sqrt{\mathrm{var}(z)} \tag{25} $$

and the coefficient of variation 

$$ \mathrm{CV}(z) = \frac{\mathrm{sd}(z)}{m(z)}, \tag{26} $$

where we use the (only) value of z for the expectation 
value of z: 

$$ m(z) = z. \tag{27} $$

Because we do not know anything of the distribution of the 
series of results z, it is impossible to calculate a 95 % 
confidence interval, a significance level, etc. That is different for 
Hong et al. (2010), who assume that the output distribution in 
LCA is log-normal. By introducing this assumption and 
calculating a measure of dispersion (in their case, a geometric 
standard deviation), they can calculate a confidence interval 
and test the null hypothesis that the true value is zero. 


2.3.2 Comparative metrics 

Many LCA studies are comparative, that is, they aim to 
compare the environmental performance of two products I 
and II or even of a larger set of products I, II, III, etc. 
(Lenzen 2006). As discussed by several authors (see, e.g., 
Hong et al. 2010), there is an issue in the case of comparative 
LCA. When we wish to compare two alternatives, I and II, we 
are interested in the value of a certain result, say z, for both 
systems. In a traditional LCA, without uncertainties, we 
calculate $z_{\mathrm{I}}$ and $z_{\mathrm{II}}$ and see which one is higher. When we 
calculate the uncertainties with the analytical method, we find 
in addition a measure of the dispersion of the central values for 
I and II. But we cannot test whether the difference is in any 
sense significant or not. Again, the assumption of 
log-normality by Hong et al. (2010) and Imbeault-Tetrault et al. 
(2013) opens new vistas. These authors build on the comparison 
indicator by Huijbregts (1998), defined as 

$$ \mathrm{CI}_{\mathrm{I,II}}(z) = \frac{z_{\mathrm{I}}}{z_{\mathrm{II}}} \tag{28} $$

As $z_{\mathrm{I}}$ and $z_{\mathrm{II}}$ are assumed to follow a log-normal 
distribution, CI follows this distribution as well. With the additional 
assumption that $z_{\mathrm{I}}$ and $z_{\mathrm{II}}$ are independent, it follows that 

$$ \ln^2\left(\mathrm{GSD}(\mathrm{CI}_{\mathrm{I,II}}(z))\right) = \ln^2\left(\mathrm{GSD}(z_{\mathrm{I}})\right) + \ln^2\left(\mathrm{GSD}(z_{\mathrm{II}})\right) \tag{29} $$

so that a measure of dispersion (the GSD) of the comparison 
indicator can be obtained. Unfortunately, when $z_{\mathrm{I}}$ and $z_{\mathrm{II}}$ are 
conceived as independent variables, the ratio $\mathrm{CI}_{\mathrm{I,II}}$ has a 
higher uncertainty than those of the contributing items, whereas 
our conjecture is that the comparison will have a lower 
uncertainty in practice, due to the fact that two scenarios I and 
II share a background system with common uncertainties 
(cf. Eq. (6) in Imbeault-Tetrault et al. 2013). 
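
The conjecture about a shared background system can be illustrated with a small simulation. Everything in this sketch is a hypothetical toy model of our own: each result is the product of a product-specific log-normal foreground factor and a log-normal background factor, and the two alternatives either share the background draw or draw it independently.

```python
import math
import random
import statistics


def gsd(values):
    """Geometric standard deviation: exp of the standard deviation of the logs."""
    return math.exp(statistics.stdev([math.log(v) for v in values]))


def ratio_gsd(shared_background, runs=20000, seed=7, sigma_fg=0.1, sigma_bg=0.3):
    """GSD of the comparison indicator z_I / z_II in a toy two-factor model."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(runs):
        bg_1 = rng.lognormvariate(0.0, sigma_bg)
        # Either reuse the same background realization or draw a new one.
        bg_2 = bg_1 if shared_background else rng.lognormvariate(0.0, sigma_bg)
        z_1 = 1.0 * rng.lognormvariate(0.0, sigma_fg) * bg_1
        z_2 = 1.2 * rng.lognormvariate(0.0, sigma_fg) * bg_2
        ratios.append(z_1 / z_2)
    return gsd(ratios)


gsd_independent = ratio_gsd(shared_background=False)
gsd_shared = ratio_gsd(shared_background=True)
print(f"independent: {gsd_independent:.3f}  shared background: {gsd_shared:.3f}")
```

In the independent case, the squared log-GSDs of the two results add up, as in the independence formula above; with a shared background, the common factor cancels in the ratio and only the foreground uncertainty remains.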

2.3.3 Contribution to variance 

The uncertainty of a result can be conceived as being built up 
by uncertainties of the parameters. Some of these parameters 
are fairly certain, but even when all parameters are equally 
uncertain, some of them will have a higher influence on the 
uncertainty of the output than others. An analysis of the 
contribution to variance (CTV) decomposes the uncertainty 
into its contributing uncertainties. This provides important 
information for efficiently reducing the uncertainty. 

Given the basic formula [4] neglecting covariance 

$$ \mathrm{var}(z) \approx \left(\frac{\partial z}{\partial x}\right)^2 \mathrm{var}(x) + \left(\frac{\partial z}{\partial y}\right)^2 \mathrm{var}(y) \tag{30} $$

we can easily express the contribution to variance of z by the 
uncertainty in x, CTV(z, x), as 

$$ \mathrm{CTV}(z, x) = \left(\frac{\partial z}{\partial x}\right)^2 \mathrm{var}(x), \tag{31} $$

and similarly for CTV(z, y). For the case of LCA, following 
Heijungs’ approach, one finds 

$$ \mathrm{CTV}(g_k, a_{ij}) = \left(s_j \lambda_{ki}\right)^2 \mathrm{var}(a_{ij}) \tag{32} $$

and 

$$ \mathrm{CTV}(g_k, b_{kj}) = \left(s_j\right)^2 \mathrm{var}(b_{kj}) \tag{33} $$

Typically, 

$$ \mathrm{CTV}(g_k, f_i) = 0 \tag{34} $$

because the functional unit is a fixed number. 
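
A contribution-to-variance table follows directly from the derivatives and the input variances. The sketch below assumes these are already available; the parameter names and numbers are hypothetical, and the normalization into shares of the total variance is our own addition for readability.

```python
def contribution_to_variance(derivatives, variances):
    """Return absolute contributions (derivative^2 * variance) and their shares."""
    contributions = {
        name: derivatives[name] ** 2 * variances[name] for name in derivatives
    }
    total = sum(contributions.values())
    shares = {name: value / total for name, value in contributions.items()}
    return contributions, shares


# Hypothetical derivatives of one inventory result and hypothetical variances.
derivatives = {"a_11": -10.0, "a_12": -0.2, "a_21": -12.0, "b_11": 100.0, "b_12": 2.0}
variances = {"a_11": 0.01, "a_12": 25.0, "a_21": 0.25, "b_11": 0.0025, "b_12": 0.25}

contributions, shares = contribution_to_variance(derivatives, variances)
ranking = sorted(shares, key=shares.get, reverse=True)
for name in ranking:
    print(f"{name}: {shares[name]:.1%}")
```

Ranking the parameters in this way shows where additional data collection would reduce the output uncertainty most.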


3 Sampling methods for error analysis 

Sampling methods are based on repeatedly running the model 
with a sample of input parameters, to create a sample of model 
results. This sample of model results can be analyzed to 
extract uncertainty indicators. This section introduces different 
aspects of the sampling method: the main idea of Monte 
Carlo sampling (Section 3.1), some metrics that can be derived 
from a sample of results (Section 3.2 for the non-comparative 
case and Section 3.3 for the comparative case), 
some details on the implementation of Monte Carlo sampling 
in LCA (Section 3.4), and the possibilities for extracting 
contribution to variance information from a sample of results 
(Section 3.5). 

3.1 Monte Carlo analysis 

Suppose that we have specified all input parameters as 
distributions, e.g., the coefficients w and h in the example on the 
sheet of paper (Section 2.1.2) are not fixed numbers but are 
defined in terms of probability distributions with specified 
type and parameters. In one run, i, we draw all parameters 
from the specified distributions, denoted by $(w, h)_i$. 

We calculate quantities of interest, in this case $A_i = w_i \times h_i$ 
(see Eq. [2]). Repeating this N times, we obtain a sample 
$\{A_1, A_2, \ldots, A_N\}$. This is the basic idea of Monte Carlo sampling. 
The method assumes that the probability distribution is sampled 
in a representative way. This may require a substantial 
number of runs. 

In the case of LCA, the procedure is similar. Here, the input 
data consist of the matrices A and B, and the output is the 
vector g (see, for instance, Eq. [7]). The Monte Carlo sample 
obtained is thus $\{\mathbf{g}_1, \mathbf{g}_2, \ldots, \mathbf{g}_N\}$. 
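
The procedure can be sketched for a two-process system as follows. The coefficients, the choice of independent normal distributions with a 5 % coefficient of variation, the number of runs, and the function names are all hypothetical; coefficients equal to zero are kept fixed at zero.

```python
import random
import statistics


def solve_inventory(A, B, f):
    """Solve the 2x2 system A s = f and return g = B s."""
    (a11, a12), (a21, a22) = A
    det = a11 * a22 - a12 * a21
    s = [(a22 * f[0] - a12 * f[1]) / det, (-a21 * f[0] + a11 * f[1]) / det]
    return [row[0] * s[0] + row[1] * s[1] for row in B]


def monte_carlo_inventory(A, B, f, cv=0.05, runs=5000, seed=42):
    """Return a list of inventory vectors, one per Monte Carlo run."""
    rng = random.Random(seed)

    def draw(matrix):
        # Each coefficient is drawn independently; a zero mean gives sd = 0.
        return [[rng.gauss(v, cv * abs(v)) for v in row] for row in matrix]

    return [solve_inventory(draw(A), draw(B), f) for _ in range(runs)]


# Hypothetical system: two processes, three environmental flows.
A = [[-2.0, 100.0], [10.0, 0.0]]
B = [[1.0, 10.0], [0.1, 2.0], [0.0, -50.0]]
f = [0.0, 1000.0]

sample = monte_carlo_inventory(A, B, f)
co2 = [g[0] for g in sample]
mean_co2 = statistics.mean(co2)
sd_co2 = statistics.stdev(co2)
print(f"mean: {mean_co2:.1f}  standard deviation: {sd_co2:.2f}")
```

Unlike the analytical approach, the full sample is available afterwards, so medians, percentiles, and histograms can be derived from the same runs.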

More sophisticated methods exist to explore the sampling 
space in a more informed way. Using structural path analysis 
(Crama et al. 1984; Defourny and Thorbecke 1984; Lenzen 
2002; Suh and Heijungs 2007) or the path exchange method 
(Treloar 1997; Lenzen and Crawford 2009), an LCI system 
can be reduced to its most significant pathways and nodes, 
resulting in a much simpler and more rapid sampling task. We 
have not explored these other techniques in this paper. 

Monte Carlo techniques have been used in economic 
input-output analysis for at least 50 years (Quandt 1958, 1959) 
and in environmental input-output analysis for at least 
30 years, recently including hybrid LCA (Bullard and 
Sebald 1977, 1988; li 2000; Sakai et al. 2000; Nansai et al. 
2001; Yoshida et al. 2001; Yoshida et al. 2002). These 
techniques can be applied to a number of error categories, such as 
source data uncertainty and aggregation error, amongst others 
(Lenzen 2000). 

In carrying out a Monte Carlo analysis and in analyzing its 
results, we should distinguish two cases: 

• The one-sample or independent sample case, where a result is calculated for one product, or for several products but using new Monte Carlo runs for each product

• The multiple-sample case, where one Monte Carlo realization is used to calculate an observation for several products

In discussing the analytical approach above, we treated the 
comparative and non-comparative metrics separately. In 
discussing the sampling approach, we will do this again, but 
here, the options are richer because the samples may originate 
from independent or dependent stochastic runs. 

3.2 Non-comparative metrics calculated with the sampling 
approach 

A single sample of results may be analyzed by calculating 
various statistics. We subdivide these statistics into the family 
of parametric statistics and non-parametric statistics (Siegel 
1956). Table 1 shows some key statistics that can be derived 
from a sample of results for one alternative. Appendix 1 
provides the detailed formulas for these statistics from a 
sample of results. 

3.3 Comparative metrics calculated with the sampling 
approach 

We now examine the case in which two or more samples exist, i.e., in which two or more products are compared. The samples for a specific variable k can be written as $\{g_{k,1,i}\}$ and $\{g_{k,2,i}\}$; each of them is a series of values of length N. These samples can be used to calculate additional statistics and to test a number of hypotheses.

For the resulting samples $\{g_{k,1,i}\}$ and $\{g_{k,2,i}\}$, e.g., the carbon footprint of options 1 and 2, we can calculate two combinations of interest, the difference

$$\{d_k\} = \{g_{k,1,1} - g_{k,2,1},\; g_{k,1,2} - g_{k,2,2},\; \ldots,\; g_{k,1,N} - g_{k,2,N\}} \qquad (35)$$

and the ratio

$$\{r_k\} = \{g_{k,1,1}/g_{k,2,1},\; g_{k,1,2}/g_{k,2,2},\; \ldots,\; g_{k,1,N}/g_{k,2,N}\} \qquad (36)$$
Table 1 Key statistics that can be calculated from a series of sampling results A

| Information | Parametric statistics | Non-parametric statistics |
|---|---|---|
| Location | Mean, m(A) | Median, Q2(A) |
| Dispersion | Standard deviation, sd(A) | Interquartile range, IQR(A) |
| Relative dispersion | Coefficient of variation, CV(A) | Coefficient of quartile variation, CQV(A) |
| Range | 95 % confidence interval, CI(A) | Range, R(A) |
| Distribution | | Agreement with specified distribution (Kolmogorov-Smirnov statistic with z test) |


The two samples $\{d_k\}$ and $\{r_k\}$ can be used to calculate the same statistics as listed in Table 1. In addition, tests of equality and correlation can be carried out; see Table 2 for some main statistics of interest.
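The distinction between dependent and independent samples can be made concrete with a small sketch. Everything here is hypothetical: two products that both use electricity, an emission factor of 0.50 ± 0.05 kg CO2/kWh, and electricity uses of 10 and 9 kWh. In the dependent case, the same realization of the emission factor is used for both products within a run; in the independent case, each product gets its own draw.

```python
import math
import random
import statistics

def footprints(n_runs, shared_draws, seed=1):
    """Return samples g1, g2 of the footprints of two hypothetical products."""
    rng = random.Random(seed)
    g1, g2 = [], []
    for _ in range(n_runs):
        factor_1 = rng.gauss(0.50, 0.05)     # kg CO2 per kWh (hypothetical)
        # Dependent samples reuse the realization; independent ones redraw it.
        factor_2 = factor_1 if shared_draws else rng.gauss(0.50, 0.05)
        g1.append(factor_1 * rng.gauss(10.0, 0.2))   # product 1 uses about 10 kWh
        g2.append(factor_2 * rng.gauss(9.0, 0.2))    # product 2 uses about 9 kWh
    return g1, g2

def difference_and_ratio(g1, g2):
    """Run-by-run difference (Eq. 35) and ratio (Eq. 36) of two samples."""
    d = [a - b for a, b in zip(g1, g2)]
    r = [a / b for a, b in zip(g1, g2)]
    return d, r

def paired_t(d):
    """t statistic of the paired t test for the null hypothesis mean(d) = 0."""
    return statistics.fmean(d) / (statistics.stdev(d) / math.sqrt(len(d)))

d_dep, r_dep = difference_and_ratio(*footprints(10_000, shared_draws=True))
d_ind, r_ind = difference_and_ratio(*footprints(10_000, shared_draws=False))

print(f"dependent:   mean(d) = {statistics.fmean(d_dep):.3f}, sd(d) = {statistics.stdev(d_dep):.3f}")
print(f"independent: mean(d) = {statistics.fmean(d_ind):.3f}, sd(d) = {statistics.stdev(d_ind):.3f}")
print(f"paired t (dependent case) = {paired_t(d_dep):.1f}")
```

In this toy setting the mean difference is similar in both cases, but its dispersion is much smaller for dependent samples because the shared uncertainty largely drops out of the difference.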


3.4 Monte Carlo and LCA

The theory of Monte Carlo analysis is easily applied to LCA. In each run i, we draw all parameters from the specified distribution, $(B; A; f)_i$, and calculate quantities of interest, such as $(g; s)_i$. We then subject these quantities to statistical analysis, by means of parametric or non-parametric statistics, using them for independent or dependent samples.

There are, however, certain issues to address. First, Monte Carlo is based on the repeated calculation of results. But if one calculation takes 1 min, repeating the calculation 10,000 times takes almost 7 days. Speed of calculation is therefore an important criterion to address.

Second, the number of Monte Carlo runs requires consideration. Morgan and Henrion (p. 200 ff.) discuss this issue. We can use the sample of results to study the convergence. For instance, we can study how the mean value m, the standard deviation (sd), or the coefficient of variation (CV) develops as a function of the sample size N:

$$m(z;N) = \frac{1}{N}\sum_{i=1}^{N} z_i; \quad \mathrm{sd}(z;N) = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(z_i - m(z;N)\right)^2}; \quad \mathrm{CV}(z;N) = \frac{\mathrm{sd}(z;N)}{m(z;N)} \qquad (37)$$

Monitoring their convergence as N increases provides a clue to the sufficiency of the number of runs.

3.5 Contribution to variance


For the contribution to variance (CTV), the analytical method can employ explicit formulas for adding components to an overall variance. For the sampling methods, this will not work. Here, approaches that correlate the sample of input values with the sample of output values are available.

One such approach to calculating the CTV is the one proposed by Geisler et al. (2005). Here, the correlation between each input parameter and the output parameter is calculated and used to partition the output uncertainty:

$$\mathrm{CTV}(z,x) = \frac{r^2(z,x)}{r^2(z,x) + r^2(z,y) + \cdots} \qquad (38)$$

where z is the output parameter, x, y, ... are the input parameters, and, in the case of Geisler et al. (2005), r refers to the rank (Spearman) correlation coefficient. Alternative approaches (Saltelli et al. 2001) are based on the Pearson correlation coefficient.
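A minimal sketch of Eq. (38), assuming the sampled inputs are stored column-wise next to the sampled output. The helper names (`ranks`, `contribution_to_variance`) and the test model z = 3x + y are ours and purely illustrative; the Spearman variant is implemented as a Pearson correlation on ranks.

```python
import numpy as np

def ranks(values):
    """Ranks 0..N-1 of a 1-D array (ties are not expected for continuous samples)."""
    order = np.argsort(values)
    result = np.empty(len(values))
    result[order] = np.arange(len(values))
    return result

def contribution_to_variance(inputs, output, use_ranks=False):
    """Share of each input in the summed squared correlations with the output (Eq. 38).

    inputs : array of shape (N, p), one column per sampled input parameter
    output : array of shape (N,), the sampled result z
    """
    if use_ranks:                       # Spearman = Pearson applied to ranks
        inputs = np.column_stack([ranks(col) for col in inputs.T])
        output = ranks(output)
    r = np.array([np.corrcoef(col, output)[0, 1] for col in inputs.T])
    return r**2 / np.sum(r**2)

# Hypothetical model z = 3x + y with independent standard-normal inputs,
# for which the exact variance shares are 9/10 and 1/10.
rng = np.random.default_rng(seed=0)
x_and_y = rng.normal(size=(20_000, 2))
z = 3.0 * x_and_y[:, 0] + x_and_y[:, 1]

ctv_pearson = contribution_to_variance(x_and_y, z)
ctv_spearman = contribution_to_variance(x_and_y, z, use_ranks=True)
print("CTV (Pearson): ", ctv_pearson)
print("CTV (Spearman):", ctv_spearman)
```

Because the shares are built from estimated correlations, weak contributors are estimated with large relative error at small N, which is one reason CTV estimates may need many more runs than the variance itself.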


4 Three case studies

This section shows the results for three case studies. Section 4.1 is a small illustrative example, of which all details can be described within this article. Section 4.2 is a large and realistic process LCA example, using the ecoinvent v2 data with their uncertainties. Section 4.3 is a simple but large hybrid IO LCI example. Given that process analysis featured in the first two examples, emphasis is placed here on the IO part, and only minimal process information is included.

Table 2 Statistics that can be calculated from two series of sampling results A and B. The null hypothesis is always that the values of the two samples are equal and the alternative hypothesis that they are unequal. The p statistic is therefore always based on a two-sided test

| Information | Parametric statistics, independent | Parametric statistics, dependent | Non-parametric statistics, independent | Non-parametric statistics, dependent |
|---|---|---|---|---|
| Centrality | m(A)=m(B) (t test) | m(A)=m(B) (paired t test) | Q2(A)=Q2(B) (Mann-Whitney with z test) | Q2(A)=Q2(B) (Wilcoxon paired signed-rank with z test) |
| Correlation | - | r(A,B)=0 (Pearson correlation with t test) | - | r_s(A,B)=0 (Spearman correlation with t test) |

Int J Life Cycle Assess (2014) 19:1445-1461


4.1 Illustrative example 


For the first case, we refer to the example by Heijungs and 
Suh (2002, p. 14). This is an inventory with two processes, 
two products, and three elementary flows. The matrices are 
given by 


A = 




(39) 


All process inputs and emissions were given a normal 
distribution with a coefficient of variation of 10 % (Table 3). 

In order to test the importance of the assumption of normality or log-normality, we repeated the calculation, now taking log-normal distributions with a geometric standard deviation of 1.3 (Table 4).

All computations can be done very quickly. Even the 
100,000 Monte Carlo runs were finished in just a few seconds. 
As we see, both for normally distributed and for log-normally 
distributed data, the computations for analytical and sampling 
coincide very well. Our choice of presentation makes it once 
more clear that the Monte Carlo sampling approach can access 
more statistics (mean, quartiles, etc.) than the Taylor series 
approach, which can only calculate the standard deviation 
(and the variance of course). 
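The comparison for the normally distributed case can be sketched as follows. The matrix values are an assumption on our part: they are recalled from the Heijungs and Suh (2002) example and were chosen because they reproduce the baseline column of Table 3 (120, 14, and -100). We also assume that "process inputs and emissions" means that product outputs stay fixed, so that only the fuel input of electricity production and the entries of B are uncertain. The Taylor part uses the first-order derivative of g = B A^-1 f without covariances.

```python
import numpy as np

# Assumed data (see lead-in); rows of A are products, columns are processes.
A = np.array([[-2.0, 100.0],     # fuel (l): used by electricity, made by fuel production
              [10.0,   0.0]])    # electricity (kWh): made by electricity production
B = np.array([[1.0,  10.0],      # CO2 (kg)
              [0.1,   2.0],      # SO2 (kg)
              [0.0, -50.0]])     # crude oil (l), negative = extraction
f = np.array([0.0, 1000.0])      # functional unit: 1,000 kWh of electricity

CV = 0.10
sd_A = np.zeros_like(A)
sd_A[0, 0] = CV * abs(A[0, 0])   # the only process input; product outputs are kept fixed
sd_B = CV * np.abs(B)

def taylor_sd(A, B, f, sd_A, sd_B):
    """First-order (Gaussian) error propagation for g = B A^-1 f, no covariances."""
    s = np.linalg.solve(A, f)            # scaling vector
    lam = B @ np.linalg.inv(A)           # dg_k/da_ij = -lam[k, i] * s[j]
    var_g = (lam**2) @ (sd_A**2) @ (s**2) + (sd_B**2) @ (s**2)
    return np.sqrt(var_g)

def monte_carlo(A, B, f, sd_A, sd_B, n_runs, seed=0):
    """Return an (n_runs, 3) sample of inventory vectors g."""
    rng = np.random.default_rng(seed)
    A_runs = rng.normal(A, sd_A, size=(n_runs, *A.shape))
    B_runs = rng.normal(B, sd_B, size=(n_runs, *B.shape))
    s_runs = np.linalg.inv(A_runs) @ f                 # one scaling vector per run
    return np.einsum("nkj,nj->nk", B_runs, s_runs)     # g = B s for every run

g_baseline = B @ np.linalg.solve(A, f)
sd_taylor = taylor_sd(A, B, f, sd_A, sd_B)
g_runs = monte_carlo(A, B, f, sd_A, sd_B, n_runs=100_000)
sd_mc = g_runs.std(axis=0, ddof=1)

for name, g0, s_mc, s_t in zip(("CO2", "SO2", "Crude oil"), g_baseline, sd_mc, sd_taylor):
    print(f"{name:>9}: baseline = {g0:7.1f}, sd (MC) = {s_mc:.3g}, sd (Taylor) = {s_t:.3g}")
```

Under these assumptions the Taylor standard deviations agree with the "sd (Taylor)" column of Table 3 to three significant figures, and the sampled standard deviations fall within a fraction of a percent of them.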

We also used both methods to calculate the contribution to variance. Table 5 shows the results for the normally distributed uncertainties, using the analytical methods and the Monte Carlo method with 100,000 runs, both with a rank (Spearman) correlation (Geisler et al. 2005) and a normal (Pearson) correlation (Saltelli et al. 2001).

Again, the results are very similar. The three methods assign approximately the same contribution to each input uncertainty.


Table 3 Statistics for the uncertainty of a very small LCA system with normally distributed uncertainties, computed by 100,000 Monte Carlo (MC) runs and by the analytical method (Taylor)

| Flow | Baseline | m (MC) | sd (MC) | Q2 (MC) | IQR (MC) | sd (Taylor) |
|---|---|---|---|---|---|---|
| CO2 | 120 | 120 | 10.4 | 119 | 18.2 | 10.4 |
| SO2 | 14 | 14 | 1.15 | 13.9 | 2.02 | 1.15 |
| Crude oil | -100 | -100 | 14.1 | -100 | 24.7 | 14.1 |








Table 4 Statistics for the uncertainty of a very small LCA system with log-normally distributed uncertainties, computed by 100,000 Monte Carlo (MC) runs and by the analytical method (Taylor)

| Flow | Baseline | m (MC) | sd (MC) | Q2 (MC) | IQR (MC) | sd (Taylor) |
|---|---|---|---|---|---|---|
| CO2 | 120 | 120 | 13.7 | 119 | 18.2 | 13.7 |
| SO2 | 14 | 14 | 1.51 | 13.9 | 2.02 | 1.51 |
| Crude oil | -100 | -100 | 18.7 | -100 | 24.7 | 18.6 |








4.2 Large process analysis using the ecoinvent database 

The second case employs the ecoinvent database. The matrix A has a size of 4,087 × 4,087, while B is 3,795 × 4,087. The ecoinvent consortium has moreover added impact categories to their data. Altogether, characterization factors for 672 impact categories are organized in a characterization matrix of size 672 × 3,795. We defined three alternatives, with reference flows “1 pkm transport, passenger car, RER,” “1 pkm transport, aircraft, passenger, RER,” and “1 pkm transport, high speed train, DE” and concentrated on the impact “kg CO2-Eq IPCC 2007, climate change, GWP 100a, GLO.” The data in ecoinvent come with estimated uncertainties for almost all parameters. A total of 92,284 uncertainty distributions on A and B (out of 135,892, so about 2 out of 3) were used, most of them log-normal. In addition, mock uncertainties were introduced on the characterization factors: normal with a CV of 10 %. The results for the three transport alternatives are in Table 6.

Again, the results of the analytical and the Monte Carlo 
approach are in good agreement with one another, in particular 
for the aircraft and train alternatives. For the car, the difference 
is larger, but we still judge the order of magnitude very 
comparable. 

The analytical calculation finishes in a few minutes. This is in stark contrast with the Monte Carlo approach, which requires between 10 s and 1 min per run, depending on hardware and algorithm. Running 1,000 simulations thus takes between roughly 3 and 17 h. Although 1,000 runs might be argued to be on the small side, the results are fairly stable and moreover in agreement with those of the analytical method. We did not try to go beyond the 1,000 runs.
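Whether a given number of runs is "stable" can be checked by tracking the mean, standard deviation, and CV as the sample grows, as in Eq. (37). The sketch below is a generic illustration: the log-normal sample is a hypothetical stand-in for a series of Monte Carlo results, and the running statistics are updated with Welford's one-pass algorithm (our choice, not something prescribed in this paper) so that the whole history is obtained in a single sweep.

```python
import math
import random

def running_statistics(sample):
    """Yield (N, m, sd, CV) for N = 2, 3, ... using Welford's one-pass update."""
    mean, sum_sq = 0.0, 0.0
    for n, z in enumerate(sample, start=1):
        delta = z - mean
        mean += delta / n
        sum_sq += delta * (z - mean)        # accumulates sum of squared deviations
        if n >= 2:
            sd = math.sqrt(sum_sq / (n - 1))
            yield n, mean, sd, sd / mean

# Hypothetical log-normal results standing in for a Monte Carlo sample of g.
rng = random.Random(42)
sample = [rng.lognormvariate(0.0, 0.3) for _ in range(10_000)]

history = list(running_statistics(sample))
for n, m, sd, cv in history:
    if n in (10, 100, 1_000, 10_000):
        print(f"N = {n:>6}: m = {m:.4f}, sd = {sd:.4f}, CV = {cv:.4f}")
```

Plotting the three columns of `history` against N shows at which sample size the statistics stop drifting; a statistic that still moves noticeably between N = 1,000 and N = 10,000 has not converged.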

We also did not complete a comparison of the contribution to variance by the two methods. With the analytical methods, the results were obtained in a few minutes. After 1,000 runs, the agreement with the Monte Carlo approach was very poor, and in fact, the CTVs proved to be very unstable, changing substantially from experiment to experiment. The correlations between the inputs and the output appear to require a much larger sample size than 1,000 to give a robust contribution to variance estimate. So, even though the uncertainty of the result itself seems to be reliable with 1,000 runs, a contribution to variance seems to require many more runs.

Table 5 Contribution to variance (CTV) on the basis of 100,000 Monte Carlo (MC) runs, using the Pearson correlation coefficient and the rank (Spearman) correlation coefficient and using the analytical method (Taylor)

| Process | Flow | CTV (MC, Pearson) (%) | CTV (MC, Spearman) (%) | CTV (Taylor) (%) |
|---|---|---|---|---|
| Electricity production | Fuel | 4 | 4 | 4 |
| Electricity production | CO2 | 93 | 93 | 92 |
| Fuel production | CO2 | 3 | 4 | 4 |

4.3 Example using a hybrid IO-based LCI 

The third case study is more extensive than the previous ones. 
It is different in several respects: it is IO-based instead of 
process-based, the functional unit is a consumption basket 
specified as the final demand block of the hybrid IO system, 
and an analysis of the effect of a first-order approximation is 
included. 


4.3.1 Description of the system 

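In this example, we set up a closed input-output system

$$\mathbf{T}^* = \begin{pmatrix} \mathbf{T} & \mathbf{y} \\ \mathbf{v} & 0 \end{pmatrix}, \quad \mathbf{f}^* = \begin{pmatrix} \mathbf{0} \\ 1 \end{pmatrix}, \quad \mathbf{B}$$
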

where T is a regional industry-by-industry input-output transaction table for Australia, and the vectors v and y describe the income earned and the products bought, respectively, by a hypothetical but typical Sydney household. v and y represent a single process integrated with the IO system. In input-output parlance, this system is termed “closed” because the usually exogenous income and expenditure vectors are endogenized into the intermediate transaction matrix. The alignment to the nomenclature introduced in Section 2 is via the intervention matrix B and the functional unit f, as follows: The vector $\mathbf{f}^* = [\mathbf{0}|1]'$ selects and extracts one unit of one process (family metabolism) as the functional unit from the compound IO transaction matrix $\mathbf{T}^* = \begin{pmatrix} \mathbf{T} & \mathbf{y} \\ \mathbf{v} & 0 \end{pmatrix}$. The matrix B holds the environmental interventions, as usual. T is sized 2,752 × 2,752 sectors (data sources are described in Gallego and Lenzen 2009). As a consequence, the extended matrices $\mathbf{T}^* = \begin{pmatrix} \mathbf{T} & \mathbf{y} \\ \mathbf{v} & 0 \end{pmatrix}$, $\mathbf{A}^* = \mathbf{T}^*\hat{\mathbf{x}}^{*-1}$, and $\mathbf{L}^* = (\mathbf{I}-\mathbf{A}^*)^{-1}$ are sized 2,753 × 2,753 sectors. The extended total use vector is $\mathbf{x}^* = \begin{pmatrix} \mathbf{x} \\ z \end{pmatrix}$, with $z = \mathbf{1}'\mathbf{y} = \mathbf{v}\mathbf{1}$ being the budget of the household and $\mathbf{x} = \mathbf{T}\mathbf{1} + \mathbf{y}$ being gross output of the economy. The satellite account B (content and data sources listed in ISA 2010) includes a row of total greenhouse gas emissions by industry, which we will employ in this case study to calculate the environmental intensity vector $\mathbf{b}^t$ (see Section 2.2.2). We ignore direct household emissions and set $\mathbf{b}^{t*} = [\mathbf{b}^t|0]$. y holds the expenditure on products by the average Sydney family, with a carbon footprint of about $g_0 z = 52$ tonnes of CO2-equivalents (data sources and Global Warming Potentials described in Lenzen and Peters 2010).

In our Monte Carlo experiments, we follow Hong et al. (2010) in specifying standard deviations $\Delta[\log_{10}(\mathbf{T})]$ in the logs of the input-output transactions matrix T and applying log-normal perturbations, i.e., $\mathbf{T} = 10^{\log_{10}(\mathbf{T}_0) \pm k\,\Delta[\log_{10}(\mathbf{T}_0)]}$ with $k \in [0,1]$, so that the elements of T can never become negative. For the sake of simplicity in this example, we assume that the environmental interventions $\mathbf{b}^t$ and the “process” vectors v and y and therefore also household budget z are known exactly. We also assume that total use x is certain (x is usually very large and therefore associated with much smaller uncertainties than the elements of T; cp. Bullard and Sebald 1977; 1988) so that the Monte Carlo perturbation only involves $\mathbf{A}^* = \begin{pmatrix} \mathbf{T} & \mathbf{y} \\ \mathbf{v} & 0 \end{pmatrix}\hat{\mathbf{x}}^{*-1}$, $\mathbf{L}^* = (\mathbf{I}-\mathbf{A}^*)^{-1}$ and $g = \mathbf{b}^{t*}\mathbf{L}^*\mathbf{f}^*$.

A* is a compound matrix consisting of distinct sub-matrices. Since IO theory is usually formulated in terms of these sub-matrices, we will transform $g = \mathbf{b}^{t*}\mathbf{L}^*\mathbf{f}^*$ so that g is a separate function of T and y. Utilizing Miyazawa’s partitioned inverse (Miyazawa 1966), we find that


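$$(\mathbf{I}-\mathbf{A}^*)^{-1} = \begin{pmatrix} \mathbf{I}-\mathbf{T}\hat{\mathbf{x}}^{-1} & -\mathbf{y}z^{-1} \\ -\mathbf{v}\hat{\mathbf{x}}^{-1} & \mathbf{I} \end{pmatrix}^{-1} = \begin{pmatrix} \mathbf{L}\left(\mathbf{I}+\mathbf{y}z^{-1}\mathbf{K}\mathbf{v}\hat{\mathbf{x}}^{-1}\mathbf{L}\right) & \mathbf{L}\mathbf{y}z^{-1}\mathbf{K} \\ \mathbf{K}\mathbf{v}\hat{\mathbf{x}}^{-1}\mathbf{L} & \mathbf{K} \end{pmatrix} \qquad (40)$$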
Table 6 Statistics for the uncertainty of a very large LCA system, computed by the analytical method and by 100 Monte Carlo runs

| Alternative | Baseline | m (MC) | sd (MC) | Q2 (MC) | IQR (MC) | sd (Taylor) |
|---|---|---|---|---|---|---|
| Car | 0.182 | 0.181 | 0.0498 | 0.0173 | 0.0538 | 0.0557 |
| Aircraft | 0.126 | 0.126 | 0.00563 | 0.126 | 0.00862 | 0.0056 |
| Train | 0.0637 | 0.0649 | 0.00693 | 0.0652 | 0.0104 | 0.00657 |



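with $\mathbf{L} = \left(\mathbf{I}-\mathbf{T}\hat{\mathbf{x}}^{-1}\right)^{-1}$ and $\mathbf{K} = \left(\mathbf{I}-\mathbf{v}\hat{\mathbf{x}}^{-1}\mathbf{L}\mathbf{y}z^{-1}\right)^{-1}$. Considering that the household’s budget is much smaller than the output of the economy, and therefore $\|\mathbf{v}\| \ll \|\mathbf{x}\|$, we find that $\mathbf{K} \approx \mathbf{I}$, and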


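$$(\mathbf{I}-\mathbf{A}^*)^{-1} \approx \begin{pmatrix} \mathbf{L} & \mathbf{L}\mathbf{y}z^{-1} \\ \mathbf{0} & \mathbf{I} \end{pmatrix} \qquad (41)$$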


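With $\mathbf{f}^* = [\mathbf{0}|1]'$ and $\mathbf{b}^{t*} = [\mathbf{b}^t|0]$, we arrive at $g \approx \mathbf{b}^t\mathbf{L}\mathbf{y}z^{-1}$, where the term $\mathbf{y}z^{-1}$ contains the expenditure shares $y_i/z$ in the total household budget z, nicely normalized to 1 as set in the functional unit.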


4.3.2 Departure of Taylor approximation from the true inverse 


First, we illustrate the departure of the first-order Taylor 
approximation (see Section 2.2.2) from the exact inverse 
formula, by plotting both the variations of the inverse-based 
carbon footprint 


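$$\Delta g_{\mathrm{Inv}} = \frac{g(\mathbf{T}) - g_0}{g_0} \quad \text{(labeled ‘Inverse’)} \qquad (42)$$

and the variations of the carbon footprint based on a Taylor series approximation

$$\Delta g_{\mathrm{Taylor}} = \sum_{ij} \frac{\partial g}{\partial T_{ij}}\,\frac{T_{ij}-T_{0,ij}}{g_0} = \sum_{ij} \frac{\left[\mathbf{b}^t\left(\mathbf{I}-\mathbf{T}_0\hat{\mathbf{x}}^{-1}\right)^{-1}\right]_i \left[\left(\mathbf{I}-\mathbf{T}_0\hat{\mathbf{x}}^{-1}\right)^{-1}\mathbf{y}z^{-1}\right]_j x_j^{-1}\left(T_{ij}-T_{0,ij}\right)}{g_0} \quad \text{(labeled ‘Taylor’)} \qquad (43)$$

into the same diagram (Fig. 2).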

Up to about $k = \pm\tfrac{1}{2}$, or $\mathbf{T} = 10^{\log_{10}(\mathbf{T}_0) \pm \frac{1}{2}\Delta[\log_{10}(\mathbf{T}_0)]}$, the Taylor approximation gives reasonable results; however, beyond $|k| > \tfrac{1}{2}$, deviations become significant to the extent that the uncertainty in g is underestimated for k > 0 and overestimated for k < 0. This result can be explained by the fact that the Taylor approximation neglects terms of second and higher order, and hence, the curvature of the Taylor approximation is smaller.

In the calculation above, all elements of T departed from those of $\mathbf{T}_0$ in one and the same direction. In reality, the “true” values of some elements in T may be larger than the nominal values, and others may be smaller, so that effects on the overall uncertainty of g tend to cancel out. This behavior can either be simulated using Monte Carlo analysis or analytically approximated using the error propagation formula given in Section 2.2.2.
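The departure of the first-order approximation from the exact inverse can be reproduced qualitatively on a toy system. The sketch below uses a hypothetical three-sector table; none of the numbers come from the Australian case study, and a single log-standard deviation of 0.2 is assumed for every element. All elements are shifted in the same direction by the variation parameter k, as in the comparison described above, and the gradient is the derivative of g = b (I - T x^-1)^-1 y/z with respect to T.

```python
import numpy as np

# Hypothetical three-sector system; none of these numbers come from the case study.
T0 = np.array([[10.0, 20.0,  5.0],
               [15.0,  5.0, 10.0],
               [ 5.0, 10.0, 20.0]])          # nominal transactions
x = np.array([100.0, 120.0, 80.0])           # total use, treated as certain
b = np.array([0.5, 0.2, 0.1])                # emission intensities
y_share = np.array([0.5, 0.3, 0.2])          # expenditure shares y / z
delta = np.full_like(T0, 0.2)                # standard deviations of log10(T)

def footprint(T):
    """g = b (I - T x^-1)^-1 y/z; T / x divides column j by x[j]."""
    L = np.linalg.inv(np.eye(len(x)) - T / x)
    return b @ L @ y_share

L0 = np.linalg.inv(np.eye(len(x)) - T0 / x)
g0 = footprint(T0)
gradient = np.outer(b @ L0, (L0 @ y_share) / x)   # dg/dT_ij evaluated at T0

def relative_changes(k):
    """Return (exact, first-order) relative change of g for variation parameter k."""
    T = 10.0 ** (np.log10(T0) + k * delta)
    exact = (footprint(T) - g0) / g0
    taylor = np.sum(gradient * (T - T0)) / g0
    return exact, taylor

for k in (-1.0, -0.5, 0.0, 0.5, 1.0):
    exact, taylor = relative_changes(k)
    print(f"k = {k:+.1f}: inverse = {exact:+.4f}, Taylor = {taylor:+.4f}")
```

Because g is a convex function of the (non-negative) transactions, the tangent always lies below the exact curve: the linearization gives too small an increase for k > 0 and too large a decrease for k < 0, which is the pattern described for Fig. 2.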


4.3.3 Analytical approach to uncertainty

The analytical approximation proceeds as follows: We start with the variance formula from Section 2.2, ignoring covariance terms:

$$\mathrm{var}[g] = \sum_{ij} \left(\frac{\partial g}{\partial \log_{10}(T_{ij})}\right)^2 \mathrm{var}\left[\log_{10}(T_{ij})\right] \qquad (44)$$

Starting with our earlier approximation $g \approx \mathbf{b}^t\mathbf{L}\mathbf{y}z^{-1}$, we express g in terms of $\log_{10}(\mathbf{T})$ as

$$g\left(\log_{10}(\mathbf{T})\right) = \mathbf{b}^t\left(\mathbf{I}-10^{\log_{10}(\mathbf{T})}\hat{\mathbf{x}}^{-1}\right)^{-1}\mathbf{y}z^{-1} \qquad (45)$$

The derivative in the variance formula can be worked out by applying the chain rule:

$$\frac{\partial g}{\partial \log_{10}(T_{ij})} = \frac{\partial g}{\partial T_{ij}}\,\frac{\partial\left(10^{\log_{10}(T_{ij})}\right)}{\partial \log_{10}(T_{ij})} = \frac{\partial g}{\partial T_{ij}}\,\frac{\partial\left(e^{\ln(10)\log_{10}(T_{ij})}\right)}{\partial \log_{10}(T_{ij})} = \left[\mathbf{b}^t\left(\mathbf{I}-\mathbf{T}\hat{\mathbf{x}}^{-1}\right)^{-1}\right]_i \left[\left(\mathbf{I}-\mathbf{T}\hat{\mathbf{x}}^{-1}\right)^{-1}\mathbf{y}z^{-1}\right]_j x_j^{-1}\ln(10)\,T_{ij} \qquad (46)$$

Inserting into the variance formula yields

$$\mathrm{var}[g] = \sum_{ij} \left(\left[\mathbf{b}^t\left(\mathbf{I}-\mathbf{T}\hat{\mathbf{x}}^{-1}\right)^{-1}\right]_i \left[\left(\mathbf{I}-\mathbf{T}\hat{\mathbf{x}}^{-1}\right)^{-1}\mathbf{y}z^{-1}\right]_j x_j^{-1}\ln(10)\,T_{ij}\right)^2 \mathrm{var}\left[\log_{10}(T_{ij})\right] = \sum_{ij} \left[\left(\mathbf{b}^t\mathbf{L}\right)_i \left(\mathbf{L}\mathbf{y}z^{-1}\right)_j x_j^{-1}\ln(10)\,T_{ij}\right]^2 \mathrm{var}\left[\log_{10}(T_{ij})\right] \qquad (47)$$

Standard deviations and coefficients of variation can be calculated from the above expression for the variance according to formulae derived above. The input data for the term $\mathrm{var}[\log_{10}(T_{ij})]$ (Fig. 3) were computed by applying a RAS-type approach to the variance of primary data used to construct the IO system (for further details on the RAS approach, see Gallego and Lenzen 2009). In this RAS approach, an error propagation formula is fitted to data for the standard deviations of primary data, as a function of the standard deviations of the IO table elements $T_{ij}$. Figure 3 shows that variances are large for small elements $T_{ij}$ and vice versa (compare with Lenzen et al. 2010).

Inserting all data, the Taylor approximation yields CV(g) = 2.18 %. The calculation for the 2,752 × 2,752-sized system takes about 20 s on an off-the-shelf laptop equipped with a 1.6-GHz Intel Core.

variation parameter k in $\mathbf{T} = 10^{\log_{10}(\mathbf{T}_0) \pm k\,\Delta[\log_{10}(\mathbf{T}_0)]}$

Fig. 3 Data for $\mathrm{var}[\log_{10}(T_{ij})]$ plotted against $\log_{10}(T_{ij})$ (data taken from Gallego and Lenzen 2009; Lenzen et al. 2010)

4.3.4 Sampling approach to uncertainty

The sampling approach proceeds as follows: Applying Monte Carlo simulation requires the repeated calculation of $\mathbf{T} = 10^{\log_{10}(\mathbf{T}_0) + k\sqrt{\mathrm{var}[\log_{10}(\mathbf{T}_0)]}}$, where $k \in [-1,1]$ is a normally distributed random variable, calculating $g \approx \mathbf{b}^t\left(\mathbf{I}-\mathbf{T}\hat{\mathbf{x}}^{-1}\right)^{-1}\mathbf{y}z^{-1}$ from T, plotting the distribution of the relative deviations of g, and finally deriving var[g] from the mean of this distribution.

Fig. 4 Frequency distribution of relative deviations. Frequency intervals (“bins”) are 0.01 % apart, and the distribution peak occurs at 2.06 %
Table 7 Summary of statistics that can be calculated from the results of a sampling approach. The statistics that are also available with the analytical method are presented in bold

| Sample | Parametric statistics | Non-parametric statistics |
|---|---|---|
| One sample or independent samples | Mean (m); **Standard deviation (sd)**; **Coefficient of variation (CV)**; Confidence intervals (CI); p(mean = 0) | Median (Q2); Interquartile range (IQR); Coefficient of quartile variation (CQV); Range (R); p(Q2 = 0) |
| Dependent samples | Mean difference (m_d); Mean ratio (m_r); p(m_d = 0) | Median difference (Q2d); Median ratio (Q2r); p(Q2d = 0) |


The results of 100,000 Monte Carlo runs on this system (Fig. 4) yield a mean for CV(g) of 2.06 %. One perturbation run of the 2,752 × 2,752-sized system takes just over 5 s on an off-the-shelf laptop equipped with a 1.6-GHz Intel Core, about the same time per element (0.6 μs) as for the slightly larger ecoinvent example when running for 10 s (Section 4.2). One hundred thousand runs of this system took just over 6 days. This shows that even parallel execution on multiple cores will not significantly change the odds of Monte Carlo codes in terms of computer runtime.

These results support our conclusion that an analytical 
approach will yield results that are very close (within about 
5 % in this case) to the sampling-based results, but that can be 
obtained in a vastly shorter time. 
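The two routes to var[g] can be compared on a small system. The sketch below is an illustration only: the three-sector table, intensities, and the uniform log-variance are hypothetical, and the perturbation factor is drawn from an unbounded standard normal distribution, which is our simplifying assumption. The analytical part evaluates the variance formula of Section 4.3.3 at the nominal table; the sampling part perturbs every element independently.

```python
import numpy as np

# Hypothetical three-sector system (illustrative numbers only).
T0 = np.array([[10.0, 20.0,  5.0],
               [15.0,  5.0, 10.0],
               [ 5.0, 10.0, 20.0]])          # nominal transactions
x = np.array([100.0, 120.0, 80.0])           # total use, treated as certain
b = np.array([0.5, 0.2, 0.1])                # emission intensities
y_share = np.array([0.5, 0.3, 0.2])          # expenditure shares y / z
var_log = np.full_like(T0, 0.05**2)          # var[log10(T_ij)], assumed uniform

identity = np.eye(len(x))
L0 = np.linalg.inv(identity - T0 / x)        # T0 / x divides column j by x[j]
g0 = b @ L0 @ y_share

# Analytical route: derivative with respect to log10(T_ij), then sum of squares.
dg_dlog = np.outer(b @ L0, (L0 @ y_share) / x) * np.log(10.0) * T0
cv_taylor = np.sqrt(np.sum(dg_dlog**2 * var_log)) / g0

# Sampling route: independent log-normal perturbation of every element.
rng = np.random.default_rng(seed=0)
n_runs = 20_000
k = rng.normal(size=(n_runs, *T0.shape))     # assumption: unbounded standard normal
T_runs = 10.0 ** (np.log10(T0) + k * np.sqrt(var_log))
L_runs = np.linalg.inv(identity - T_runs / x)
g_runs = np.einsum("i,nij,j->n", b, L_runs, y_share)
cv_mc = g_runs.std(ddof=1) / g_runs.mean()

print(f"g0 = {g0:.4f}, CV (Taylor) = {cv_taylor:.4%}, CV (Monte Carlo) = {cv_mc:.4%}")
```

For log-standard deviations this small, the two CVs should agree within a few percent; the analytical value needs a single matrix inversion, whereas the sampling value needs one inversion per run.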


5 Conclusions and discussion 

In this paper, we focused on a comparison of two classes of 
error propagation methods, the analytical and the sampling 
approach. We discussed their foundation and implementation 
in LCA, in terms of the data input needs, the formulas, and the 
types of output obtained, with an emphasis on the differences 
and similarities in performance. 

Our first conclusion is that the two approaches differ in terms of their required input data. The sampling procedure requires a specification of the probability distributions of the input data, for instance in terms of a normal or log-normal distribution. Such a specification requires a modeller or database provider to estimate such distributions for each parameter. Traditionally, users specify only a central value, such as a mean or median, which is then typically interpreted as the most likely value. In specifying a probability distribution, a user needs to provide at least two more types of information: the shape of the distribution and the parameter or parameters describing that distribution. For instance, a user could specify the shape as normally distributed and a parameter for the standard deviation, the variance, or the SD95 (range within which 95 % of the points are located). In cases of the most widely used distributions (uniform, symmetric triangular, log-normal), one parameter for dispersion suffices. In the case of other distributions (non-symmetric triangular, beta, Weibull, etc.), additional parameters are needed.

For the analytical approach, only the second moment of the distribution (i.e., the variance) is needed. This means that the distribution itself need not be known, and even if it were known, it would not need to be passed to the software. Thus, the data requirements are smaller for the analytical than for the sampling approaches.

A second conclusion is that the propagation algorithms are the most distinguishing features of the two approaches. The default approach for the analytical method, studied in this paper, is the Gaussian method, based on a first-order Taylor series approximation. The default approach for the sampling method is the Monte Carlo method. Both methods are easy to implement in software.

A third conclusion is that the two methods provide different 
sets of information. The analytical method does not calculate a 
distribution but only yields the second moment (the variance) 
of the distribution. Of course, this gives an important clue to 
the confidence intervals, as it is the most basic indicator of 
dispersion. 

The sampling method returns a sample of results, from which 
many statistics can be calculated. In this paper, we discussed two- 
by-two classes of such statistics: parametric vs. non-parametric 
and independent vs. dependent samples (see Table 7). 

Next, in a numerical comparison, the two methods give similar results, at least to the extent that they yield shared indicators. Basically, the only shared indicator between the analytical and sampling approach is the variance (and thereby the standard deviation and coefficient of variation). Many of the statistics that are accessible by the sampling approach (median, interquartile range, etc.) cannot be obtained using the analytical approach. That is an important aspect for a user facing the choice between an analytical and a sampling approach. Another distinguishing feature is that the analytical approach is based on a linearization that only holds when the error term is not too big, while the sampling approach may perform better, albeit with a larger number of runs. Covariances between random input variables can in principle be included in both approaches, but we conjecture that, except for small and tailor-made systems, such covariances will be unknown, and their propagation will be unfeasible in practice.

On the other hand, there is an enormous performance difference in terms of computing time. For small systems (with a few unit processes), this difference is unimportant, but for larger systems (of the size of ecoinvent), it can mean days instead of minutes. More specifically, variances calculated using sampling and analytical approaches for our small system agree with one another, but for the large system, the sampling method would take so much time that an analytical approach seems most appropriate.

The take-home message is subtle. Sampling methods give access to more types of information (because an entire distribution is calculated instead of just two moments of the distribution) than analytical methods do, but they require more information and (sometimes much) more computer time. Most, but not all, key information can more rapidly be extracted using an analytical method. Tests of significant differences between life cycle scenarios, one of the most critical results obtained from LCA studies, have so far only been undertaken with sampling methods. Our work shows that the comparative advantages of analytical methods would enable such tests to be carried out at much reduced expense, ultimately leading to enhanced life cycle information. At the same time, it should be stressed that uncertainty analyses have intrinsic limitations. They should be handled with care, and they must be supplemented by sensitivity analyses.
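Appendix A: Formulas for sampling statistics

Table 8 The sample of size N is indicated as $\{A_i\}_{i=1,\ldots,N} = \{A_1, \ldots, A_N\}$. The ordered sample is indicated by $\{A_{(i)}\}_{i=1,\ldots,N}$

| Statistic | Symbol | Formula |
|---|---|---|
| Mean | m | $m(A) = \frac{1}{N}\sum_{i=1}^{N} A_i$ |
| Variance | var | $\mathrm{var}(A) = \frac{1}{N-1}\sum_{i=1}^{N} \left(A_i - m(A)\right)^2$ |
| Standard deviation | sd | $\mathrm{sd}(A) = \sqrt{\mathrm{var}(A)}$ |
| Coefficient of variation | CV | $\mathrm{CV}(A) = \frac{\mathrm{sd}(A)}{m(A)}$ |
| 95 % confidence interval | CI | $\mathrm{CI}(A) = \left[m(A) - 1.96\,\mathrm{sd}(A),\; m(A) + 1.96\,\mathrm{sd}(A)\right]$ |
| Median | Q2 | $Q_2(A) = A_{((N+1)/2)}$ for N odd; $Q_2(A) = \left(A_{(N/2)} + A_{(N/2+1)}\right)/2$ for N even |
| First quartile | Q1 | $Q_1(A) = Q_2\{A_{(i)}\}_{i=1,\ldots,(N+1)/2}$ for N odd; $Q_1(A) = Q_2\{A_{(i)}\}_{i=1,\ldots,N/2}$ for N even |
| Third quartile | Q3 | $Q_3(A) = Q_2\{A_{(i)}\}_{i=(N+1)/2,\ldots,N}$ for N odd; $Q_3(A) = Q_2\{A_{(i)}\}_{i=N/2+1,\ldots,N}$ for N even |
| Interquartile range | IQR | $\mathrm{IQR}(A) = Q_3(A) - Q_1(A)$ |
| Coefficient of quartile variation | CQV | |
| Range | R | $R(A) = A_{(N)} - A_{(1)}$ |
| Mean difference | m_d | $m_d(A,B) = m(A) - m(B)$ |
| Mean ratio | m_r | $m_r(A,B) = \frac{1}{N}\sum_{i=1}^{N} A_i/B_i$ |
| Correlation | r | $r(A,B) = \frac{\sum_{i=1}^{N}\left(A_i - m(A)\right)\left(B_i - m(B)\right)}{\sqrt{\sum_{i=1}^{N}\left(A_i - m(A)\right)^2}\,\sqrt{\sum_{i=1}^{N}\left(B_i - m(B)\right)^2}}$ |
| Rank correlation | r_s | $r_s(A,B) = r\left(\mathrm{rank}(A), \mathrm{rank}(B)\right)$, i.e., the correlation r applied to the ranks of the values |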